Keir Fraser [Fri, 4 Sep 2009 07:42:10 +0000 (08:42 +0100)]
xend: Support "bootloader" mode for "drbd:" devices
To be able to use "bootloader" on drbd devices the following changes
need to be made:
*) Translation of devicename
_parse_uname which is used by blkdev_uname_to_file which is again used
by _configureBootloader in XendDomainInfo needs to be able to resolve
drbd resources to the corresponding blockdevice to feed to the
configured bootloader.
*) Activation of drbd device
If the drbd device isn't in Primary mode when the bootloader tries to
fetch the kernel and initrd, the start of the DomU will fail. To
prevent this the given drbd device will be made Primary before the
bootloader gets executed.
A note on the naming of drbd resources: drbd uses mostly resource names
in its userland tools. Because of that drbd VBDs, if configured with
the "drbd:" type, should always use the drbd resource name as
suggested by the drbd documentation at
http://www.drbd.org/users-guide-emb/s-xen-configure-domu.html. My
patches assume that the VBDs are named accordingly.
Signed-off-by: Michael Renner <michael.renner@geizhals.at>
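The two steps described in the drbd commit above can be sketched as follows.
This is a minimal illustration only, not the code from the patch: the
function names and the injected resolve_drbd / make_primary callables are
hypothetical stand-ins for whatever xend uses to look up the block device of
a drbd resource and to promote it to Primary.

```python
def parse_uname(uname, resolve_drbd):
    """Split a VBD uname such as 'drbd:r0' into (type, device path).

    resolve_drbd is a hypothetical callable mapping a drbd resource
    name to its block device; other types are passed through untouched.
    """
    typ, sep, name = uname.partition(":")
    if not sep:
        return None, uname
    if typ == "drbd":
        return typ, resolve_drbd(name)
    return typ, name


def prepare_for_bootloader(uname, resolve_drbd, make_primary):
    """Return the device to hand to the bootloader.

    For 'drbd:' unames the resource is promoted first (make_primary is
    a hypothetical callable), so the bootloader can read kernel and initrd.
    """
    typ, sep, name = uname.partition(":")
    if sep and typ == "drbd":
        make_primary(name)
    return parse_uname(uname, resolve_drbd)[1]
```

With a lookup table {"r0": "/dev/drbd0"} as resolver, 'drbd:r0' yields
/dev/drbd0 after promoting r0, while 'phy:/dev/sda1' is returned unchanged
and nothing is promoted.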
Keir Fraser [Fri, 4 Sep 2009 07:34:45 +0000 (08:34 +0100)]
xend: fix domain_migrate
When the guest (a pv-on-hvm guest that cannot suspend) reboots during
live migration, the disconnection on the source side is not transmitted
to the destination side. As a result, the error processing on the
destination side is not executed.
Signed-off-by: Tomonari Horikoshi <t.horikoshi@jp.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 3 Sep 2009 08:51:37 +0000 (09:51 +0100)]
vt-d: fix Dom0 S3 resume.
When resuming from Dom0 S3, here 'irq' is -1, so we can't use it at
all. We should always use iommu->irq.
With the patch applied on the current tip 20153 and using the 2.6.18
Dom0, Dom0 S3 works fine (at least on my DQ35).
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Thu, 3 Sep 2009 08:50:46 +0000 (09:50 +0100)]
x86 vpt: Small performance fixes.
1. Fix: once a one-shot timer had fired, its IRQ was raised repeatedly forever.
2. Test pending_intr_nr before pt_irq_masked(), as it is cheaper.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Thu, 3 Sep 2009 08:49:41 +0000 (09:49 +0100)]
xm: Add "tap2" to attach blocktap disks to VM
I detected a problem when using XenAPI. When I started a VM by
using xm create command, blocktap disks were not attached to the
VM.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Thu, 3 Sep 2009 06:37:27 +0000 (07:37 +0100)]
x86: com devices's irqaction shouldn't free.
Since the irqs of serial devices are initialized early in Xen boot and
their irqaction is not allocated from the heap, it does not need to be
freed in the release irq logic.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 3 Sep 2009 06:29:29 +0000 (07:29 +0100)]
[IOMMU] dynamic VTd page table for HVM guest
This patch makes the HVM VTd page table dynamic, just as is done for PV
guests, to avoid the overhead of maintaining the page table until a PCI
device is actually assigned to the HVM guest.
Signed-Off-By: Zhai, Edwin <edwin.zhai@intel.com>
Keir Fraser [Wed, 2 Sep 2009 15:15:05 +0000 (16:15 +0100)]
libxenguest: Remove unused static inline function is_loadable_phdr()
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Wed, 2 Sep 2009 15:12:41 +0000 (16:12 +0100)]
Enable some SCSI drivers in pvops kernel config
Enables a couple of SCSI host controllers which are found in our test
farm but not enabled in the default upstream kernel. The new drivers
are compiled as modules which is pretty harmless so this should be
safe.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Wed, 2 Sep 2009 10:40:04 +0000 (11:40 +0100)]
x86: Remove the redundant logic in set_msi_affinity
Remove the redundant logic in set_msi_affinity. It was introduced
accidentally, probably by a mistake when I generated the patch.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Wed, 2 Sep 2009 10:39:27 +0000 (11:39 +0100)]
xm: Make cpu_{cap|weight} available when using XenAPI
Currently, the cpu_weight and cpu_cap parameters in domain
configuration files are ignored when using XenAPI.
This patch makes the parameters take effect.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Wed, 2 Sep 2009 10:39:02 +0000 (11:39 +0100)]
x86: rdtsc emulation (PV and HVM) must be monotonically increasing
The Intel SDM (section 18.10) clearly states that rdtsc
returns a "monotonically increasing unique value".
Current emulation code for rdtsc (both PV and HVM) returns
only a monotonically-non-decreasing (non-unique) value,
so ensure stale value is always incremented.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
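The difference between "non-decreasing" and "strictly increasing" in the
rdtsc commit above can be sketched like this. It is a simplified model, not
the hypervisor code: the class name and the injected read_raw callable are
assumptions made for illustration.

```python
class EmulatedTsc:
    """Hand out strictly increasing values even if the raw source stalls."""

    def __init__(self, read_raw):
        self._read_raw = read_raw  # hypothetical raw time source
        self._last = None

    def rdtsc(self):
        now = self._read_raw()
        # A stale (equal or older) reading is bumped past the last value
        # returned, so two calls never observe the same number.
        if self._last is not None and now <= self._last:
            now = self._last + 1
        self._last = now
        return now
```

Fed the raw readings 100, 100, 99, 105, the emulation returns 100, 101, 102,
105: unique and increasing, where a plain max() would have repeated 100.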
Keir Fraser [Wed, 2 Sep 2009 10:38:24 +0000 (11:38 +0100)]
pygrub: Match bare-metal GRUB behavior for passwords
The password support patch already merged didn't match the bare-metal
GRUB behavior so I created a patch to match it. If password is entered
in grub.conf file, pressing `p` is required exactly like when using
"real" (bare-metal) GRUB. New options are available after the correct
password is entered.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Keir Fraser [Tue, 1 Sep 2009 10:36:51 +0000 (11:36 +0100)]
x86 hvm: remove pt_reset()
Virtual platform timers are not sync'ed with guest's TSC any more
since c/s 17716. Thus pt_reset is now useless.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Tue, 1 Sep 2009 10:36:16 +0000 (11:36 +0100)]
x86 passthru:: graphics passthrough
This patch supports basic gfx passthrough on xen side:
- add a VGA type for gfx passthrough, and get the size of the VGA bios
of the passed-through gfx device in hvmloader
- add a config option 'gfx_passthru' for gfx passthrough
Signed-off-by: Ben Lin <ben.y.lin@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Tue, 1 Sep 2009 10:34:31 +0000 (11:34 +0100)]
x86: Make the hypercall PHYSDEVOP_alloc_irq_vector hypercall dummy.
This patch makes the hypercall PHYSDEVOP_alloc_irq_vector a
dummy and defers vector allocation to the point where dom0 programs
ioapic entries. Basically, dom0 shouldn't touch the vector namespace,
which is only used by the hypervisor for servicing real devices'
interrupts. This patch also makes the broken NetBSD dom0 work again.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 1 Sep 2009 10:32:47 +0000 (11:32 +0100)]
[IA64] Further irq-vector fix.
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Mon, 31 Aug 2009 17:17:26 +0000 (18:17 +0100)]
xend: Fix c/s 20137 -- do not redefine built-in name 'str'.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 31 Aug 2009 09:54:32 +0000 (10:54 +0100)]
x86 hvm: Clean up VLAPIC interfaces a little, and fix vlapic_ipi().
A boolean flag was overflowing a uint8_t.
Thanks to Dongxiao Xu at Intel for tracking down the bug.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 31 Aug 2009 09:17:09 +0000 (10:17 +0100)]
[IA64] Fix serial console freeze issue
20110:6e83b0ec2d70 is incomplete. irq_to_vector() is still required,
otherwise the serial console freezes without sync_console.
I confirmed that dom0 booted up without sync_console.
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Mon, 31 Aug 2009 09:14:26 +0000 (10:14 +0100)]
libxc: Avoid a constant-zero-sized memset().
Some environments warn about this, which fails the build.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 31 Aug 2009 09:12:10 +0000 (10:12 +0100)]
xend: Greater verbosity on domain creation failure
Attached patch makes error reporting more verbose when
xc.domain_create() fails or raises an Exception.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Mon, 31 Aug 2009 09:10:17 +0000 (10:10 +0100)]
x86/numa: fix c/s 20120 (Fix SRAT check for discontig memory)
That change converted the (wrong) assumption of contiguous nodes'
memory to a similarly wrong one of assuming discontiguous memory (i.e.
each node having separate E820 table entries). The code ought to be
able to deal with both, though, and I hope this change makes it so.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Alex Williamson <alex.williamson@hp.com>
Keir Fraser [Mon, 31 Aug 2009 09:09:12 +0000 (10:09 +0100)]
properly __initdata-annotate command line option string buffers
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 09:08:38 +0000 (10:08 +0100)]
x86: properly __init-annotate time.c
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 09:06:53 +0000 (10:06 +0100)]
introduce size_param()
With there being several instances of custom_param() where the handler
is just invoking parse_size_and_unit(), it seems to make sense to
introduce a simplifying abstraction.
Also fix serial_txbufsz not having been guaranteed to be a power of
two.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 08:54:25 +0000 (09:54 +0100)]
x86_emulate: honor failure of in_longmode()
Failure of in_longmode() shouldn't be treated the same as the function
returning 'true'.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 08:51:45 +0000 (09:51 +0100)]
x86, ept: remove execute permission for granted pages' P2M entries
When backporting c/s 20026 I noticed that granted pages get execute
permission, which doesn't seem desirable (and has been avoided for PV
guests for quite a while).
Even for p2m_mmio_direct it seems suspicious to allow execution, but
being less certain here I left it as is for the time being.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 08:51:05 +0000 (09:51 +0100)]
Adjust non-default sized console ring allocation
Using xmalloc() for objects that are guaranteed to be at least as
large as a page is wasteful, as it will always result in more (here:
double the amount) being allocated.
The other adjustments are more cosmetic:
- Updating conring and conring_size can be done so NMI/MCE generated
messages don't use the new (larger) size with the old (smaller)
buffer.
- The size printed can be in KiB (for the value to be easier to grasp)
since it is always a multiple of the default of 16KiB.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Mon, 31 Aug 2009 08:47:30 +0000 (09:47 +0100)]
x86: fix get_free_pirq
GSIs should not be allocated for other purposes, so change
the hard-coded limit.
Also fix the check after the loop: it should be '<' instead of
'=='.
Signed-off-by: Qing He <qing.he@intel.com>
Keir Fraser [Thu, 27 Aug 2009 10:25:34 +0000 (11:25 +0100)]
x86: softtsc for PV domains
Implement softtsc (TSC emulation) for userland code in PV domains. It
currently is tied to the existing "softtsc" Xen boot option (which
does the same thing but for HVM domains). Later it should be tied to
a vm.cfg option, but this is sufficient for now to obtain performance
degradation data for PV environments that heavily utilize rdtsc. To
record emulation frequency, use debug-key "s".
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 27 Aug 2009 09:13:13 +0000 (10:13 +0100)]
x86: fix msi_free_irq().
1) We should invoke destroy_irq() before msix_put_fixmap().
2) destroy_irq() invokes mask_msi_irq() eventually, so we can remove
the duplicate mask operation in the 'if' statement here.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Thu, 27 Aug 2009 09:12:41 +0000 (10:12 +0100)]
[HVM] add super page support for HVM migration
This patch tries to allocate 2M pages on the target side based on analysis
of the pfn sequence sent from the source side for HVM migration.
The algorithm is: If pseudo-phys page is not yet populated in target
domain, AND it is first page of a 2MB extent, AND no other pages in
that extent are yet populated, AND the next pages in the save-image
stream populate that extent in order, THEN allocate a super page. If
the next 511 pages (to make the 2MB extent) are split across a batch
boundary, we have to optimistically allocate a super page in this
batch, and then break it into several 4K pages in the next batch,
which is speculative.
This patch is also friendly to PV guest migration.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
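The allocation rule in the super page commit above can be written down as a
small predicate. This is only a sketch under assumptions: the function name,
the representation of a batch as a list of pfns and of the populated pages as
a set are all hypothetical, and the later step of breaking a wrongly guessed
super page back into 4K pages is not modelled.

```python
SUPERPAGE_PFNS = 512  # number of 4K pages in one 2MB extent


def should_allocate_superpage(populated, batch, index):
    """Decide whether batch[index] should start a 2MB super page.

    populated: set of pfns already populated in the target domain.
    batch:     pfns of the current batch, in save-image stream order.
    """
    pfn = batch[index]
    if pfn % SUPERPAGE_PFNS != 0:
        return False  # not the first page of a 2MB extent
    extent = range(pfn, pfn + SUPERPAGE_PFNS)
    if any(p in populated for p in extent):
        return False  # some page of the extent is already populated
    following = batch[index:index + SUPERPAGE_PFNS]
    # The stream must fill the extent in order. If the batch ends before
    # the extent is complete, accept optimistically (the speculative case).
    return following == list(extent[:len(following)])
```

A batch covering pfns 512..1023 in order qualifies; so does a batch that ends
at pfn 699 (the optimistic case), while a batch that skips pfn 513 does not.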
Keir Fraser [Wed, 26 Aug 2009 14:41:59 +0000 (15:41 +0100)]
xend: Do not pass pointer to a 16-bit domid_t to PyArg_ParseTuple()
when it expects a full integer.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 26 Aug 2009 14:35:14 +0000 (15:35 +0100)]
xend: Flask MLS security label handling
Changed the way security labels are handled to allow domains to be
labeled with Flask MLS security labels. Changed the error message
generated when an invalid context is submitted to be more useful.
Signed-off-by: Machon B. Gregory <mbgrego@tycho.ncsc.mil>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Tue, 25 Aug 2009 15:26:02 +0000 (16:26 +0100)]
stubdom: Backport fix for SIZE_MAX from newlib 1.17.0
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 25 Aug 2009 14:36:37 +0000 (15:36 +0100)]
Accurate accounting for credit scheduler
Rather than debit a full 10ms of credit on a scheduler tick
(probabilistic), debit credits accurately based on time stamps.
The main problem this is meant to address is an attack on the
scheduler that allows a rogue guest to avoid ever being debited
credits. The basic idea is that the rogue process checks time (using
rdtsc) periodically, and yields after 9.5ms. Using this technique, a
guest can "steal" 95% of the cpu. This is particularly an issue in
cloud environments.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
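The attack described in the credit scheduler commit above can be reproduced
in a toy model. This is a sketch, not scheduler code: the function names are
made up, time is in microseconds, and the rogue guest is assumed to start
running 250us after each tick and yield 250us before the next one.

```python
TICK_US = 10_000  # scheduler tick period: 10ms, as in the commit message


def tick_sampled_debit(run_intervals, horizon_us):
    """Old scheme: debit a full tick to whoever runs when the tick fires."""
    debit = 0
    for tick in range(TICK_US, horizon_us + 1, TICK_US):
        if any(start <= tick < end for start, end in run_intervals):
            debit += TICK_US
    return debit


def timestamp_debit(run_intervals):
    """New scheme: debit exactly the time spent running."""
    return sum(end - start for start, end in run_intervals)


def rogue_intervals(periods):
    """Run 9.5ms out of every 10ms, but never across a tick boundary."""
    return [(p * TICK_US + 250, p * TICK_US + 9_750) for p in range(periods)]
```

Over 100 tick periods the rogue guest runs for 950ms in total. Tick sampling
debits it nothing, while timestamp-based accounting debits the full 950ms.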
Keir Fraser [Tue, 25 Aug 2009 13:59:09 +0000 (14:59 +0100)]
xend: Fix typos in configure_vtpm
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Tue, 25 Aug 2009 13:58:42 +0000 (14:58 +0100)]
x86 numa: Fix SRAT check for discontig memory
We currently compare the sum of the pages found in the SRAT table to
the address of the highest memory page found via the e820 table to
validate the SRAT. This is completely bogus if there's any kind of
discontiguous memory, where the sum of the pages could be much smaller
than the address of the highest page. I think all that's necessary is
to validate that each usable memory range in the e820 is covered by an
SRAT entry. This might not be the most efficient way to do it, but
there are usually a relatively small number of entries on each side.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
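The validation proposed in the SRAT commit above amounts to a coverage test.
The sketch below takes the simplest reading of the commit text, namely that
each usable e820 range must lie inside a single SRAT entry; the function name
and the use of half-open (start, end) page ranges are assumptions.

```python
def srat_covers_e820(e820_usable, srat_ranges):
    """True if every usable e820 range lies within some SRAT range.

    Both arguments are lists of half-open (start, end) page ranges.
    The nested scan is quadratic, which is acceptable for the small
    number of entries usually found on each side.
    """
    return all(
        any(s_start <= start and end <= s_end for s_start, s_end in srat_ranges)
        for start, end in e820_usable
    )
```

With usable memory at pages 0x0-0x80000 and 0x100000-0x180000, an SRAT that
lists both ranges passes, and one that lists only the first range fails, no
matter how the page counts compare with the highest page address.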
Keir Fraser [Tue, 25 Aug 2009 13:58:07 +0000 (14:58 +0100)]
xen/xsm/flask: Fix Flask MLS context generation
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Keir Fraser [Tue, 25 Aug 2009 13:57:45 +0000 (14:57 +0100)]
pygrub: Set path in #! line of pygrub, too
pygrub currently has a hardcoded path of /usr/bin/python which is not
correct if the version of python at install time is not the same as
that at build time. This patch uses the existing install-wrap and
python/get-path machinery.
(It does not address the currently-existing bug that the get-path
machinery works by assuming that `python' is a symlink, rather than
querying the python interpreter for its version.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Tue, 25 Aug 2009 13:56:54 +0000 (14:56 +0100)]
xend: Add support for URI ('file:' and 'data:' scheme) for PV/kernel
and PV/ramdisk
Add support for 'file:' and 'data:' URI schemes for the parameters
'PV/kernel' and 'PV/ramdisk' in the VM.create() call. The 'data:'
scheme handling enables using a file which is stored inside the
management system (from where the XenAPI call is sent) as kernel or
ramdisk.
Notes:
o all included: a detailed description can be found in the xenapi
documentation
o bumped up the version of the API document to 1.0.8 (because of
(minimal) interface extension)
o Future enhancements (like http:, ftp: schemes) fit seamlessly into
the current design / classes
o Unittest cases and xm-test case included
Signed-off-by: Andreas Florath <xen@flonatel.org>
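How a 'data:' URI can carry a kernel or ramdisk image, as in the commit
above, is sketched below. The decoder follows the general data URI layout
(data:[mediatype][;base64],payload); the function name is hypothetical and
the exact form accepted by xend is described in the xenapi documentation,
not here.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri):
    """Return the payload bytes carried by a 'data:' URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("missing ',' between header and payload")
    if header.endswith(";base64"):
        return base64.b64decode(payload)
    # Without ';base64' the payload is percent-encoded text.
    return unquote_to_bytes(payload)
```

A management system would base64-encode the image, prepend a header such as
data:application/octet-stream;base64, and pass the result as PV/kernel.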
Keir Fraser [Mon, 24 Aug 2009 07:27:30 +0000 (08:27 +0100)]
libxc: More LZMA/BZIP fixes.
- Fix an error message in xc_try_bzip2_decode()
- Check library installation on demand using a Makefile function,
rather than generating a dependency file. Cleaner and avoids a race
on generating the dep file.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 24 Aug 2009 07:08:52 +0000 (08:08 +0100)]
vtpm: Upgrade to using tpm_emulator-0.5.1
The newer version of the emulator contains several bug fixes, one that
we were seeing in our use of vtpm.
This patch also defines TPM_STRONG_PERSISTENCE for the new emulator.
A couple of important notes about this patch:
-This has only been tested on PVM domUs. In theory it should work for
HVM but I have not tried it at all and can guarantee nothing.
-All the relevant changes in tools/vtpm/vtpm.patch have been ported
to tpm_emulator-0.5.1.
-None of the changes in tpm_emulator.patch have been ported. In
particular this means the BUILD_EMULATOR option, which as I understand
lets you use the tpm_emulator in dom0 for a machine that does
not have a real hardware TPM does not work. This functionality should
be easy to add though because the new emulator already comes with a
kernel module interface.
-No considerations were made for the VTPM_MULTI_VM feature (which is
supposedly unfinished). This patch may or may not break any progress
made on that feature.
Signed-off-by: Matt Fioravante <Matthew.Fioravante@jhuapl.edu>
Keir Fraser [Mon, 24 Aug 2009 07:05:46 +0000 (08:05 +0100)]
xend: Allow vtpm instance uuid to be specified on domain creation
Right now xen will create a new vtpm instance every time you start up a
domU, even if you specify the instance parameter in your config file.
Each vtpm instance is then given a uuid and the vtpm.db file maps
instance numbers to uuid numbers.
This patch is a hack that lets you explicitly set the uuid of your
vtpm instance. Every time you boot up your domU now the vtpm will get
that uuid and thus it will always get the same vtpm instance number
instead of having a new one generated.
So for example, in your config file you would do something like this
vtpm = [ 'backend=0,uuid=dcdb124b-9fed-4040-b149-dd2dfd8d094c' ]
Signed-off-by: Matt Fioravante <Matthew.Fioravante@jhuapl.edu>
Keir Fraser [Mon, 24 Aug 2009 07:03:46 +0000 (08:03 +0100)]
vtpm: Fix hashed-memory file writing.
There is a bug in the vtpm_manager that has to do with hashing and
saving the NVM memory files (vtpm_dm_%d.data). The file is not
truncated when it is written and this results in the hash becoming
invalid because of the extra bits at the end of the file.
This patch adds O_TRUNC to the flags when opening the file.
More details on this issue are in the bug report on bugzilla:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1488
Signed-off-by: Matt Fioravante <Matthew.Fioravante@jhuapl.edu>
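The effect of the missing O_TRUNC in the vtpm commit above is easy to
reproduce outside the vtpm_manager. The sketch below uses Python's os.open
purely for illustration; the helper names and the file name are made up.

```python
import os
import tempfile


def write_file(path, data, truncate):
    """Write data at offset 0, optionally truncating the file first."""
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return file contents after a short rewrite, without and with O_TRUNC."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vtpm_dm_0.data")  # illustrative name
        write_file(path, b"long old contents", truncate=True)
        write_file(path, b"new", truncate=False)
        with open(path, "rb") as f:
            stale = f.read()
        write_file(path, b"long old contents", truncate=True)
        write_file(path, b"new", truncate=True)
        with open(path, "rb") as f:
            clean = f.read()
    return stale, clean
```

Without O_TRUNC the tail of the previous, longer contents stays in the file,
so a hash computed over the whole file no longer matches the data written.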
Keir Fraser [Mon, 24 Aug 2009 07:02:08 +0000 (08:02 +0100)]
x86: run timers when populating Dom0's P2M table
When booting Dom0 with huge amounts of memory, and/or memory accesses
being sufficiently slow (due to NUMA effects), and the ACPI PM timer
or a high frequency HPET being used, the time it takes to populate the
M2P table may significantly exceed the overflow time of the platform
timer, screwing up time management to the point where Dom0 boot fails.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Fri, 21 Aug 2009 16:14:35 +0000 (17:14 +0100)]
x86: Ensure irq is disabled before taking vector_lock.
Fixed debug lock issue for taking vector lock.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Fri, 21 Aug 2009 16:13:54 +0000 (17:13 +0100)]
ia64: Fix ia64 build issue introduced by per-cpu vector changes.
ia64 has no per-cpu vector support, so change the related APIs back
through defining macros.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Fri, 21 Aug 2009 16:13:17 +0000 (17:13 +0100)]
Update .hgignore for tools/libxc/.zlib.deps
Keir Fraser [Fri, 21 Aug 2009 16:12:13 +0000 (17:12 +0100)]
docs/misc: Update XSM Flask documentation
Update the XSM Flask documentation to reflect the support for
policy.24, the updated policy and policy build infrastructure, and how
to enable the optional MLS policy.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Fri, 21 Aug 2009 16:11:40 +0000 (17:11 +0100)]
pygrub: Fix elilo handling after password patch.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Keir Fraser [Fri, 21 Aug 2009 16:00:01 +0000 (17:00 +0100)]
Revert 20105:979fd420311b
Keir Fraser [Fri, 21 Aug 2009 10:10:49 +0000 (11:10 +0100)]
libxc: Remove minios-specific hack for generating .zlib.deps file
It's not needed if one relative path is replaced.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 20 Aug 2009 21:26:16 +0000 (22:26 +0100)]
libxenguest: Fix libbz2/liblzma dependency computation.
1. Create an empty dep file if neither lib is installed
2. Forcibly disable support for libs if building minios
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 20 Aug 2009 21:12:25 +0000 (22:12 +0100)]
domain builder: Implement bzip2 and LZMA loaders
Recent upstream kernels can be compressed using either gzip,
bzip2, or LZMA. However, the PV kernel loader in Xen currently only
understands gzip, and will fail on the other two types. The attached
patch implements kernel decompression for gzip, bzip2, and LZMA so
that kernels compressed with any of these methods can be launched.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
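The idea behind the bzip2/LZMA loader commit above, choosing a decompressor
from the start of the image, can be sketched with Python's standard
compression modules. This is not the libxc implementation: the function name
is hypothetical, and since the legacy LZMA format has no real magic number
the 5d 00 00 prefix test is a heuristic assumed for this example.

```python
import bz2
import gzip
import lzma


def decompress_kernel(blob):
    """Return the decompressed kernel image, or blob itself if unrecognised."""
    if blob[:2] == b"\x1f\x8b":          # gzip magic
        return gzip.decompress(blob)
    if blob[:3] == b"BZh":               # bzip2 magic
        return bz2.decompress(blob)
    if blob[:3] == b"\x5d\x00\x00":      # common legacy LZMA header (heuristic)
        return lzma.decompress(blob, format=lzma.FORMAT_ALONE)
    return blob                          # assume it is not compressed
```

An image compressed by any of the three methods comes back as the same
bytes, and an uncompressed ELF image is passed through untouched.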
Keir Fraser [Thu, 20 Aug 2009 20:15:24 +0000 (21:15 +0100)]
tools/flask/policy: Updates to policy and policy build infrastructure
The original xen policy infrastructure was based off of an early
version of refpolicy. Because of this there was a lot of cruft that
does not apply to building a policy for xen. This patch does several
things. First it cleans up the makefile to remove many unnecessary
build targets. Second it fixes an issue where the policy build process
wasn't handling interface files properly. Third it pulls in the MLS
support functions from current ref policy and makes use of
them. Finally it updates the xen policy with new rules to address
changes in xen since the policy was last worked on, and provides
several new abstractions for creating domains.
Signed-off-by: David P. Quigley <dpquigl@tycho.nsa.gov>
Keir Fraser [Thu, 20 Aug 2009 17:27:31 +0000 (18:27 +0100)]
x86_64 hvm: Adjust COMPAT_VIRT_START for 32-bit HVM guests.
The PV limit should not apply as there is no M2P table mapped into an
HVM guest's virtual address space.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 20 Aug 2009 15:19:01 +0000 (16:19 +0100)]
xm-test: Fix testcase '11_block_attach_shared_dom0' for up-to date
linux kernels
New kernels have ext2 disabled by default. This fix uses ext3 for
testcase 11_block_attach_shared_dom0.
Signed-off-by: Andreas Florath <xen@flonatel.org>
Keir Fraser [Thu, 20 Aug 2009 15:17:16 +0000 (16:17 +0100)]
pygrub: Add password support
It checks for the presence of a password line in the grub.conf
of the guest image and, if this line is present, supports both clear
text and md5 versions of the password. Editing the grub entries and
the command line is disabled when a password is set in the domain's
grub.conf file but has not been entered yet. Also, a new option
to press 'p' in interactive pygrub has been added to allow entering
the grub password. It has been tested on x86_64 with PV guests and
worked fine. Also, the countdown is now stopped after a key is
pressed, i.e. when the user is probably editing the boot configuration.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Keir Fraser [Thu, 20 Aug 2009 15:15:52 +0000 (16:15 +0100)]
x86: shadow_alloc_p2m_page() should call shadow_prealloc() before shadow_alloc()
shadow_alloc_p2m_page() fails to call shadow_prealloc() before calling
shadow_alloc(). In certain conditions, notably when PoD is being
exercised, this may cause shadow_alloc() to fail, crashing Xen.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Thu, 20 Aug 2009 12:32:31 +0000 (13:32 +0100)]
x86 vmx: Update EIP when appropriate during task switch
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 20 Aug 2009 09:30:53 +0000 (10:30 +0100)]
Fix xapi xm-tests.
There were a couple of small bugs in the xapi xm-test:
o outdated XenAPI calls were removed from testcase
(02_xapi-vbd_basic)
o minor problem with XendLocalStorageRepository
is fixed (missed list_images() function - which
is moved from the XenQCoWStroageRepo to the common
base class XendStorageRepository)
which was detected running 02_xapi-vbd_basic.
o XenAPI session handling and connecting is fixed.
o 03_xapi-network_pos was rewritten and now uses
XenAPI.
Signed-off-by: Andreas Florath <xen@flonatel.org>
Keir Fraser [Thu, 20 Aug 2009 09:27:37 +0000 (10:27 +0100)]
xm-test: Add status section to xm-test/README
The report functionality is not removed because there is the hope
that somebody sets up the server side infrastructure.
Signed-off-by: Andreas Florath <xen@flonatel.org>
Keir Fraser [Thu, 20 Aug 2009 09:16:58 +0000 (10:16 +0100)]
x86: Remove global percpu_mm_info structure, to make dataflow through
mm code clearer.
The FOREIGNDOM method was just confusing and pointless. The deferred
TLB flushing is of questionable value now that much automatic flushing has to be
synchronous to avoid guest SMP races.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 20 Aug 2009 07:26:51 +0000 (08:26 +0100)]
x86: teardown_msi_irq is not needed.
teardown_msi_irq logic is covered in destroy_irq,
so remove it to avoid freeing msi resource twice.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Thu, 20 Aug 2009 07:26:16 +0000 (08:26 +0100)]
x86: calculate nr_irqs_gsi correctly.
This looks like a typo introduced by Cset20076,
and it may break VT-d device assignment.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Thu, 20 Aug 2009 07:25:41 +0000 (08:25 +0100)]
xend: Fix error caused by VT-d ACS patch.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Keir Fraser [Thu, 20 Aug 2009 07:23:33 +0000 (08:23 +0100)]
pygrub: Revert 19322:3118041f2259, as it breaks timeout=0 behaviour
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 19 Aug 2009 16:00:26 +0000 (17:00 +0100)]
x86: Fix arch/x86/xen.lds dependencies.
gcc can get the dependency target name wrong (appends .o).
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 19 Aug 2009 13:23:30 +0000 (14:23 +0100)]
AMD IOMMU: support "passthrough" and "no-intremap" parameters.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Keir Fraser [Wed, 19 Aug 2009 13:22:52 +0000 (14:22 +0100)]
Update Xen Flask module to policy.24.
This is a back-port of the latest SELinux code to Xen, adjusted
for Xen coding style and interfaces. Unneeded functionality such
as most object context config data, handle_unknown, MLS field
defaulting, etc has been omitted.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Wed, 19 Aug 2009 13:22:15 +0000 (14:22 +0100)]
xen-hvmctx: don't compile for ia64.
xen-hvmctx is an x86-specific tool, so it shouldn't be compiled for ia64.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Wed, 19 Aug 2009 13:21:56 +0000 (14:21 +0100)]
[IA64] define BYTES_PER_LONG to fix compilation error.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Keir Fraser [Wed, 19 Aug 2009 13:13:52 +0000 (14:13 +0100)]
x86 hvm: Clean up vlapic/vioapic/vmsi delivery.
In particular, avoid intermediate delivery bitmaps which restrict
number of vcpus supported.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 19 Aug 2009 12:17:41 +0000 (13:17 +0100)]
xen pm trace utility cleanup
xenpm trace utility gtraceview cleanup
- add gtraceview help info on how to get raw data by xentrace
- make trace_exit_reason compiled in non-debug mode. trace_exit_reason
can be enabled/disabled by xentrace at runtime, so no need to disable
it at build time.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Keir Fraser [Wed, 19 Aug 2009 12:16:50 +0000 (13:16 +0100)]
x86 hvm: Remove vendor-specific feature masking of 0x1:ECX.
Vendors are respecting each other's bits.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Wed, 19 Aug 2009 12:12:16 +0000 (13:12 +0100)]
xend: passthrough: check if a device is behind PCIe switch that lacks ACS
Imagine a PCIe switch that doesn't support ACS (Access Control
Services) and has 2 downstream ports, A and B. According to the PCIe
spec, the switch should directly route a transaction that comes from A
and targets a device under B, so the Root Complex and IOMMU engine are
bypassed. This doesn't work at all in the case of an hvm guest and can
even cause a potential security issue, so we should not allow this kind
of device assignment.
If all the intermediate PCIe switches between a device and the Root
Complex support and enable ACS, we can safely assign the device to a guest.
Cc: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
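The rule stated in the ACS commit above is a walk up the PCI hierarchy. The
sketch below models it on a toy tree; the PciNode class, its fields and the
function name are hypothetical and do not reflect how xend reads the real
topology or the ACS capability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PciNode:
    name: str
    is_pcie_switch_port: bool = False
    acs_enabled: bool = False
    parent: Optional["PciNode"] = None  # None means directly under the Root Complex


def safe_to_assign(device):
    """True if every switch between the device and the Root Complex has ACS on."""
    node = device.parent
    while node is not None:
        if node.is_pcie_switch_port and not node.acs_enabled:
            return False  # peer-to-peer traffic could bypass the IOMMU here
        node = node.parent
    return True
```

A device sitting behind two switch ports is assignable only if both have ACS
enabled; a device attached directly to the Root Complex always passes.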
Keir Fraser [Wed, 19 Aug 2009 12:11:33 +0000 (13:11 +0100)]
hotplug scripts: better same_vm checks
Currently the function same_vm in block-common.sh is the one
responsible for detecting if two block devices can be used at the same
time by two VMs. This can be allowed in a few specific cases: when the
two VMs are actually the same VM and when the two VMs are the guest
and its stubdomain. We need to expand these exceptions to properly
handle save/restore issues: this patch adds to the exceptions the
case when two VMs are the same VM because of save/restore races, and
when two VMs are the guest and the stubdomain of the previous guest,
again during save/restore.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 19 Aug 2009 12:02:31 +0000 (13:02 +0100)]
x86: miscellaneous emulator adjustments
Defer fail_if()-s as much as possible (in favor of possibly generating
exceptions), and avoid generating exceptions when not strictly
necessary.
Avoid fail_if()-s for simple return code checks (making the code that
used them consistent with other, longer existing code).
Eliminate redundant generate_exception_if()-s checking lock_prefix
(which is already covered by the general check prior to decoding
operands).
Also fix the testing code to add PROT_EXEC for the mapping that is
intended to have instruction executed from.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 19 Aug 2009 12:02:04 +0000 (13:02 +0100)]
x86-64: adjust emulation of control transfers
While Intel and AMD implementations differ in various respects when
it comes to non-default operand sizes of control transfer instructions
and segment register loads (lfs, lgs, lss), it seems to make sense to
(a) match their behavior if they agree and (b) prefer the more
permissive behavior if they don't agree:
- honor operand size overrides on near branches (AMD does, Intel
doesn't)
- honor operand size overrides on far branches (both Intel and AMD do)
- honor REX.W on far branches (Intel does, AMD doesn't except on far
returns)
- honor REX.W on lfs, lgs, and lss (Intel does, AMD doesn't)
Also, do not permit emulation of pushing/popping segment registers
other than fs and gs as well as that of les and lds (the latter are
particularly important due to the re-use of the respective opcodes as
VEX prefixes in AVX).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
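The four rules listed in the control-transfer change above can be read as a small decision function. The following is only a sketch of that reading, not the emulator's code: the names `Kind` and `effective_operand_size` are hypothetical, and the default sizes (64 bits for near branches, 32 bits for far branches and lfs/lgs/lss in 64-bit mode) and the handling of a 0x66 prefix on lfs/lgs/lss are assumptions about the architecture that the commit message does not state.

```python
from enum import Enum


class Kind(Enum):
    NEAR_BRANCH = "near branch"
    FAR_BRANCH = "far branch"
    SEG_LOAD = "lfs/lgs/lss"


def effective_operand_size(kind, opsize_override=False, rex_w=False):
    """Operand size in bits under the 'more permissive' policy, 64-bit mode."""
    if kind is Kind.NEAR_BRANCH:
        # Assumed default of 64 bits; the 0x66 override is honored (as AMD does).
        return 16 if opsize_override else 64
    # Far branches and lfs/lgs/lss: assumed default of 32 bits.
    if rex_w:
        # REX.W is honored (as Intel does) and wins over a 0x66 prefix.
        return 64
    return 16 if opsize_override else 32


for kind in Kind:
    print(kind.value,
          effective_operand_size(kind),
          effective_operand_size(kind, opsize_override=True),
          effective_operand_size(kind, rex_w=True))
```

The point of the sketch is only that each instruction class ends up with one answer regardless of which vendor's CPU the guest runs on.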
Keir Fraser [Wed, 19 Aug 2009 12:01:41 +0000 (13:01 +0100)]
x86: extend runstate area updates
In order to give guests a hint at whether their vCPU-s are currently
scheduled (so they can e.g. adapt their behavior in spin loops), update
the run state area (if registered) also when de-scheduling a vCPU.
Also fix an oversight in the compat mode implementation of
VCPUOP_register_runstate_memory_area.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 19 Aug 2009 11:58:15 +0000 (12:58 +0100)]
x86: Fix max_gsi calculation on systems with discontiguous GSI space.
From: Steven Smith <steven.smith@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 19 Aug 2009 11:55:15 +0000 (12:55 +0100)]
xm,xend: Remove tab indents
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Wed, 19 Aug 2009 11:54:43 +0000 (12:54 +0100)]
x86: Only allocate vpid for initialised vcpus.
Currently, 32 vpids are statically allocated for each
domain, which prevents supporting more vcpus per HVM
domain. Remove the limit and only allocate a vpid for
initialised vcpus. As a result, the vpids of a single
domain's vcpus can be non-contiguous.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
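To see why on-demand allocation makes a domain's vpids non-contiguous, consider the sketch below. It assumes a single pool shared by all domains and a lowest-free-ID policy; both are illustrative assumptions, as are the class name `VpidAllocator` and the choice to skip vpid 0 (assumed here to be reserved for the hypervisor). The commit message does not describe the actual allocator.

```python
class VpidAllocator:
    """Hand out vpids from one shared pool, lowest free ID first."""

    def __init__(self, max_vpid=0xFFFF):
        self.max_vpid = max_vpid
        self.in_use = set()

    def alloc(self):
        # vpid 0 is skipped: assumed reserved for the hypervisor itself.
        for vpid in range(1, self.max_vpid + 1):
            if vpid not in self.in_use:
                self.in_use.add(vpid)
                return vpid
        raise RuntimeError("out of vpids")

    def free(self, vpid):
        self.in_use.discard(vpid)


pool = VpidAllocator()
dom_a = [pool.alloc()]      # domain A initialises vcpu 0
dom_b = [pool.alloc()]      # domain B initialises vcpu 0
dom_a.append(pool.alloc())  # domain A initialises vcpu 1
print(dom_a, dom_b)
```

Because the two domains initialise their vcpus in interleaved order, domain A's vcpus end up with vpids that are not adjacent, which is the situation the commit message says is now permitted.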
Keir Fraser [Wed, 19 Aug 2009 11:53:46 +0000 (12:53 +0100)]
x86: Implement per-cpu vector for xen hypervisor
Since the Xen and Linux code bases differ significantly, it
is very hard to port Linux's patch and apply it to Xen
directly, so this patch only adopts the core logic of Linux
and makes it work for Xen.
Key changes:
1. vector allocation algorithm
2. all IRQ chips' set_affinity logic
3. IRQ migration on cpu hot-remove.
4. Break assumptions which depend on global vector policy.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Wed, 19 Aug 2009 11:53:04 +0000 (12:53 +0100)]
x86: Change Xen hypervisor's interrupt infrastructure
from vector-based to IRQ-based.
In a per-cpu vector environment, the vector space becomes a
multi-dimensional resource, so a vector number is not appropriate
for indexing irq_desc, which stands for a unique interrupt source. As
in Linux, the irq number is chosen to index irq_desc. This patch
changes the vector-based interrupt infrastructure to an irq-based one.
It mostly follows upstream Linux's changes, with some parts
adapted for Xen.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
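The reasoning in the IRQ-based infrastructure change above can be made concrete with a toy model. This is a sketch under stated assumptions, not Xen's data structures: `IrqDesc`, `vector_irq`, `assign_vector` and `FIRST_DYNAMIC_VECTOR` are illustrative names, and the first-free allocation policy is hypothetical.

```python
NR_VECTORS = 256
FIRST_DYNAMIC_VECTOR = 0x20  # hypothetical start of the allocatable range


class IrqDesc:
    def __init__(self, irq):
        self.irq = irq
        self.cpu = None
        self.vector = None


# irq_desc is indexed by irq number: one entry per interrupt source.
irq_desc = [IrqDesc(i) for i in range(4)]

# One vector table per CPU: vector_irq[cpu][vector] -> irq (or None).
vector_irq = {cpu: [None] * NR_VECTORS for cpu in range(2)}


def assign_vector(irq, cpu):
    """Give irq the first free vector on the chosen CPU."""
    table = vector_irq[cpu]
    for vector in range(FIRST_DYNAMIC_VECTOR, NR_VECTORS):
        if table[vector] is None:
            table[vector] = irq
            irq_desc[irq].cpu = cpu
            irq_desc[irq].vector = vector
            return vector
    raise RuntimeError("no free vector on cpu %d" % cpu)


v0 = assign_vector(irq=2, cpu=0)
v1 = assign_vector(irq=3, cpu=1)
print(v0, v1, vector_irq[0][v0], vector_irq[1][v1])
```

Both irqs receive the same vector number on different CPUs, so a vector alone no longer identifies an interrupt source; the pair (cpu, vector) does, and the irq number remains a unique index into `irq_desc`.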
Keir Fraser [Wed, 19 Aug 2009 11:52:38 +0000 (12:52 +0100)]
x86: Change nr_irqs to nr_irqs_gsi.
Currently, nr_irqs is only used for GSI irqs, so change
the name to make its meaning more precise. This is also
the initial step towards supporting irq allocation for
MSI interrupt sources.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Sun, 16 Aug 2009 07:46:08 +0000 (08:46 +0100)]
gdbstub: Remove noisy message on every gdbstub entry.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sun, 16 Aug 2009 07:45:04 +0000 (08:45 +0100)]
stubdoms: parse bridge information
Currently the stubdom-dm script doesn't read the bridge of a vif
from xenstore, therefore all the vifs assigned to the stubdom always
belong to the default bridge. This patch changes the behavior by
reading the bridge from xenstore and adding it to the stubdom config
file.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Sun, 16 Aug 2009 07:43:50 +0000 (08:43 +0100)]
Revert 20066:135b350496fb
Keir Fraser [Fri, 14 Aug 2009 16:26:23 +0000 (17:26 +0100)]
xen-hvmctx: a tool to print the HVM state of a running domain
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 14 Aug 2009 16:10:11 +0000 (17:10 +0100)]
xend: VBD QoS policy bits
Add the ability to define VBD QoS policy in the xend layer.
Consider the following vbd entry:
vbd = [
'phy:/dev/server/virtualmachine1-disk,xvda1,w,credit=5000/s@50ms',
]
This means that a VM may perform 5000 I/O operations per second, with
credit being replenished every 50 milliseconds.
The 'credit' xenstore value is used by the blkback driver to ratelimit I/O
operations for the specific device.
Signed-off-by: William Pitcock <nenolod@dereferenced.org>
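As an illustration of the credit value in the VBD QoS entry above, the sketch below parses a string of the form `5000/s@50ms` and works out how many operations each replenish period would grant. The grammar is inferred from the single example in the commit message, the helper names `parse_credit` and `ops_per_period` are hypothetical, and whether blkback divides the credit per period in exactly this way is an assumption.

```python
import re

# Inferred shape: "<ops>/s@<period>ms", e.g. "5000/s@50ms".
_CREDIT_RE = re.compile(r"^(\d+)/s@(\d+)ms$")


def parse_credit(value):
    """Return (operations per second, replenish period in ms)."""
    m = _CREDIT_RE.match(value)
    if m is None:
        raise ValueError("unrecognised credit value: %r" % value)
    return int(m.group(1)), int(m.group(2))


def ops_per_period(ops_per_sec, period_ms):
    """Operations granted per replenish period (integer division)."""
    return ops_per_sec * period_ms // 1000


rate, period = parse_credit("5000/s@50ms")
print(rate, period, ops_per_period(rate, period))  # 5000 50 250
```

Under this reading, 5000 operations per second replenished every 50 milliseconds corresponds to 250 operations per period.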
Keir Fraser [Fri, 14 Aug 2009 16:09:39 +0000 (17:09 +0100)]
x86 mce: move mce quirks into separate files
Quirk handling is designed to easily add more quirks when needed
w/o messing around in the normal mce code.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Fri, 14 Aug 2009 16:08:38 +0000 (17:08 +0100)]
xsm/flask: Fix AVC audit message format
Fix formatting of Flask AVC audit messages so that existing
policy tools can parse them. After applying,
'xm dmesg | audit2allow' yields the expected result.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Fri, 14 Aug 2009 16:08:12 +0000 (17:08 +0100)]
xsm/flask: Fix sidtab locking bug
We do not need to use the _irqsave/irqrestore forms of spin locking
within the sidtab in Xen's XSM Flask module, and doing so triggers a
BUG_ON() within check_lock() when we subsequently call xmalloc().
This was preventing Xen from booting with XSM/Flask enabled if built
with debug=y. It appears that this broke upon the changes to xmalloc
in changeset 18379:14a9a1629590.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: George S. Coker, II <gscoker@alpha.ncsc.mil>
Keir Fraser [Fri, 14 Aug 2009 16:07:23 +0000 (17:07 +0100)]
AMD IOMMU: Destroy passthru guests when IO pagetable allocation fails
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Acked-by: Wei Huang <wei.huang2@amd.com>
Keir Fraser [Fri, 14 Aug 2009 11:26:35 +0000 (12:26 +0100)]
x86: cleanup rdmsr/wrmsr
Use a 64bit value instead of extracting/merging two 32bit values.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
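For reference, the extracting/merging that the rdmsr/wrmsr cleanup above removes from callers amounts to the following arithmetic. This is a sketch only; `msr_join` and `msr_split` are hypothetical helper names, not functions from the Xen tree.

```python
MASK32 = 0xFFFFFFFF


def msr_join(lo, hi):
    """Merge the low and high 32-bit halves into one 64-bit MSR value."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def msr_split(value):
    """Extract the (low, high) 32-bit halves of a 64-bit MSR value."""
    return value & MASK32, (value >> 32) & MASK32


value = msr_join(0x89ABCDEF, 0x01234567)
print(hex(value), [hex(half) for half in msr_split(value)])
```

Passing a single 64-bit value around means this bookkeeping no longer has to be repeated at every call site.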
Keir Fraser [Fri, 14 Aug 2009 09:59:13 +0000 (10:59 +0100)]
x86 mce: make debug messages less noisy
On guest MCE read, only print debug output when
a non-zero value has been read. Otherwise Xen is
too noisy.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Fri, 14 Aug 2009 09:58:32 +0000 (10:58 +0100)]
VMX: issue an NMI rather than just calling the NMI handler
when the VMEXIT code indicates that an NMI has been raised.
Otherwise we might hit a real NMI while in the handler.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>